Summary

New methods and theory have recently been developed to nonparametrically estimate cumulative
incidence functions for competing risks survival data subject to current status censoring. In
particular, the limiting distribution of the nonparametric maximum likelihood estimator and
a simplified "naive estimator" have been established under certain smoothness conditions. In
this paper, we establish the large-sample behavior of these estimators in two additional models,
namely when the observation time distribution has discrete support and when the observation
times are grouped. These asymptotic results are applied to the construction of confidence
intervals in the three different models. The methods are illustrated on two data sets regarding
the cumulative incidence of (i) different types of menopause from a cross-sectional sample of
women in the United States and (ii) subtype-specific HIV infection from a sero-prevalence study
in injecting drug users in Thailand.


1. Introduction

Current status data with competing risks arise in cross-sectional studies that assess the
"current status" of individuals in the sample with respect to an event that can be caused by
several mechanisms. An example is Cycle I of the Health Examination Survey in the United States
(MacMahon & Worcester, 1966). This study recorded the age and menopausal status of the female
participants, where menopausal status could be pre-menopausal, post-menopausal due to an
operation, or post-menopausal due to natural causes. Based on these data, the cumulative
incidence of natural and operative menopause can be estimated as a function of age. A second
example is the Bangkok Metropolitan Administration injecting drug users cohort study
(Kitayaporn et al., 1998; Vanichseni et al., 2001). This study recorded the age and HIV status
of injecting drug users, where HIV status could be HIV negative, HIV positive with subtype B,
HIV positive with subtype E, or HIV positive with some other subtype. Based on these data, the
subtype-specific cumulative incidence of HIV can be estimated as a function of age.

New methods and theory have recently been developed to nonparametrically estimate cumulative
incidence functions based on current status data with competing risks. Hudgens et al. (2001)
and Jewell et al. (2003) derived and studied the nonparametric maximum likelihood estimator and
also introduced several other estimators, including the so-called naive estimator of Jewell et
al. (2003). Maathuis (2006) and Groeneboom et al. (2008b, c) derived the large-sample behavior
of the maximum likelihood estimator and the naive estimator in a "smooth model" that imposes
certain smoothness conditions on the cumulative incidence functions and the observation time
distribution. In this model, the local rate of convergence of the maximum likelihood estimator
is $n^{1/3}$ (Groeneboom et al., 2008b, Theorem 4.17), slower than the usual $n^{1/2}$ rate.
Moreover, its limiting distribution is non-standard and involves a self-induced system of slopes
of convex minorants of Brownian motion processes plus parabolic drifts (Groeneboom et al.,
2008c, Theorems 1.7 and 1.8). The naive estimator has the same local rate of convergence as the
maximum likelihood estimator, but its limiting distribution is simpler, since it does not
involve a self-induced system (Groeneboom et al., 2008c, Theorem 1.6).

In practice, recorded observation times are often discrete, making the smooth model unsuitable.
We therefore study the large-sample behavior of the maximum likelihood estimator and the naive
estimator in two additional models: a "discrete model" in which the observation time
distribution has discrete support, and a "grouped model" in which the observation times are
assumed to be rounded in the recording process, yielding grouped observation times.

We show that the large-sample behavior of the estimators in the discrete model is fundamentally
different from that in the smooth model: the maximum likelihood estimator and the naive
estimator converge locally at rate $n^{1/2}$, and their limiting distributions are identical and
normal. These results are related to the work of Yu et al. (1998), who studied the asymptotic
behavior of the maximum likelihood estimator for current status data with discrete observation
times in the absence of competing risks. There are also connections to unpublished work of Tang,
Banerjee and Kosorok, who studied the limiting distribution of the maximum likelihood estimator
for current status data when the observation times fall on a grid that depends on the sample size.

The grouped model is related to the work of Woodroofe & Zhang (1999) and Zhang et al. (2001),
who considered the maximum likelihood estimator for a nondecreasing density when the
observations are grouped. We are not aware, however, of any work on the maximum likelihood
estimator for interval censored data with grouped observation times, even though such grouping
frequently occurs in practice. For example, in the menopause data the ages of the women were
grouped in the intervals (25, 30], (30, 35], (35, 36], (36, 37], ..., (58, 59] and recorded as the
midpoints of these intervals. The menopausal status, on the other hand, was determined at the
exact but unrecorded time of interview, yielding a mismatch between the recorded status and
the recorded observation time. For example, if a 30.7 year old pre-menopausal woman is
interviewed, she is recorded as pre-menopausal with rounded age 32.5. When ignoring the rounding,
as done in previous analyses of these data, this is taken to mean that she was interviewed at
age 32.5 and that she was pre-menopausal at that age. A correct interpretation of the data is,
however, that she was pre-menopausal at some unknown age in the interval (30, 35]. In
particular, the data do not reveal her menopausal status at age 32.5; in actuality, she might
have been post-menopausal at that age, for example due to an operation.

The grouped model accounts for such grouping of observation times. We show that the likelihood
in this model can be written in the same form as in the discrete model, but in terms of
different parameters, representing weighted averages of the cumulative incidence functions over
the grouping intervals, where the weights are determined by the observation time distribution.
This similarity with the discrete model implies that the maximum likelihood estimator and the
naive estimator in the grouped model can be computed with existing software, and that their
limiting distributions can be derived as in the discrete model. However, since the likelihood is
written in terms of different parameters, the estimates under the grouped model must be
interpreted differently. The ideas incorporated in the grouped model can be easily extended to
other forms of interval censored data.

The asymptotic results in the three models are applied to the construction of confidence
intervals, a problem that has received little attention until now. In the discrete and grouped
models, confidence intervals can be constructed by standard methods, for example using the
bootstrap or the limiting distributions derived in this paper. In the smooth model, the
non-standard limiting behavior of the estimators makes the construction of confidence intervals
less straightforward. In this case, we advocate using likelihood ratio confidence intervals
(Banerjee & Wellner, 2001) based on the naive estimator.


2. Models

2.1. Exact observation times

Consider the usual competing risks setting where an event can be caused by $K$ competing
risks, with $K \in \{1, 2, \ldots\}$ fixed. The random variables of interest are $(X, Y)$, where
$X \in \mathbb{R}$ is the time of the event of interest, and $Y \in \{1, \ldots, K\}$ is the
corresponding cause. The goal is to estimate the cumulative incidence functions
$F_0 = (F_{01}, \ldots, F_{0K})$, where $F_{0k}(t) = \mathrm{pr}(X \le t, Y = k)$ for
$k = 1, \ldots, K$. The cumulative incidence functions are non-negative, monotone
non-decreasing, and satisfy $\sum_{k=1}^{K} F_{0k}(t) = \mathrm{pr}(X \le t) \le 1$.

The difficulty in estimating the cumulative incidence functions is that we cannot observe
$(X, Y)$ directly. Rather, we observe the "current status" of a subject at a single random
observation time $C \in \mathbb{R}$. Thus, at time $C$ we observe whether or not the event of
interest has occurred, and if and only if the event has occurred, we also observe the cause $Y$.
We assume $C$ is independent of $(X, Y)$. Let $G$ denote the distribution of $C$, and let
$(C, \Delta)$ denote the observed data, where $\Delta = (\Delta_1, \ldots, \Delta_{K+1})$ is an
indicator vector for the status of the subject at time $C$:

$$\Delta_k = 1\{X \le C, Y = k\} \ \text{for } k = 1, \ldots, K, \quad \text{and} \quad \Delta_{K+1} = 1\{X > C\}. \tag{1}$$


To make this concrete, consider the HIV data discussed in Section 1, where $X$ is the age at HIV
infection, $C$ is the age at screening, and there are $K = 3$ competing risks representing the
HIV subtypes: $Y = 1$ for subtype B, $Y = 2$ for subtype E, and $Y = 3$ for other subtypes.
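
The observation scheme in (1) can be made concrete with a small simulation. The following is a minimal sketch rather than part of the original analysis: the function names, the toy event-time model and the three-point observation grid are all hypothetical and chosen only for illustration.

```python
import random

def simulate_current_status(n, event_sampler, obs_time_sampler, K, seed=0):
    """Draw n observations (C, Delta) with C independent of (X, Y).

    event_sampler(rng) -> (x, y): event time x and cause y in {1, ..., K}.
    obs_time_sampler(rng) -> c: observation time.
    Delta has K + 1 entries: Delta[k-1] = 1{X <= C, Y = k}, Delta[K] = 1{X > C}.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, y = event_sampler(rng)
        c = obs_time_sampler(rng)
        delta = [0] * (K + 1)
        if x <= c:
            delta[y - 1] = 1   # event has occurred, so the cause is observed
        else:
            delta[K] = 1       # event has not occurred, so the cause stays hidden
        data.append((c, delta))
    return data

# Hypothetical toy model with K = 2 causes (all numbers are illustrative).
def toy_event(rng):
    y = 1 if rng.random() < 0.6 else 2
    mean = 20.0 if y == 1 else 30.0
    return rng.expovariate(1.0 / mean), y

def toy_obs_time(rng):
    return rng.choice([10, 20, 30])

sample = simulate_current_status(500, toy_event, toy_obs_time, K=2)
```

Each simulated record contains exactly one nonzero entry in its indicator vector, and the exact event time never appears in the output, which mirrors the information loss under current status censoring.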

We consider the maximum likelihood estimator for $F_0$ based on $n$ i.i.d. observations of
$(C, \Delta)$, denoted by $(C_i, \Delta^i)$, $i = 1, \ldots, n$, where
$\Delta^i = (\Delta^i_1, \ldots, \Delta^i_{K+1})$. For any $K$-tuple $(x_1, \ldots, x_K)$ let
$x_+ = \sum_{k=1}^{K} x_k$ and, unless otherwise defined, let $x_{K+1} = 1 - x_+$. Moreover,
define the set $\mathcal{F}_K = \{F = (F_1, \ldots, F_K) : F_1, \ldots, F_K$ are cumulative
incidence functions and $F_+(t) \le 1$ for all $t \in \mathbb{R}\}$. A maximum likelihood
estimator for $F_0$ is defined as any $\hat F_n = (\hat F_{n1}, \ldots, \hat F_{nK}) \in \mathcal{F}_K$
satisfying $l_n(\hat F_n) = \max_{F \in \mathcal{F}_K} l_n(F)$, where $l_n(F)$ is the log likelihood

$$l_n(F) = \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K+1} \Delta^i_k \log\{F_k(C_i)\}, \tag{2}$$

with the convention $0 \log 0 = 0$; see also Jewell et al. (2003), equation (1).

We also consider the naive estimator $\tilde F_n = (\tilde F_{n1}, \ldots, \tilde F_{nK})$ of
Jewell et al. (2003), whose $k$th component is defined as any $\tilde F_{nk} \in \mathcal{F}_1$
satisfying $l_{nk}(\tilde F_{nk}) = \max_{F_k \in \mathcal{F}_1} l_{nk}(F_k)$, where

$$l_{nk}(F_k) = \frac{1}{n} \sum_{i=1}^{n} \left[ \Delta^i_k \log\{F_k(C_i)\} + (1 - \Delta^i_k) \log\{1 - F_k(C_i)\} \right] \tag{3}$$

is the marginal log likelihood for the reduced current status data $(C_i, \Delta^i_k)$,
$i = 1, \ldots, n$, and $\mathcal{F}_1$ is obtained from $\mathcal{F}_K$ by taking $K = 1$.
Since $\tilde F_{nk}$ only uses the $k$th entry of the $\Delta$-vector, the naive estimator
splits the estimation problem into $K$ well-known univariate current status problems.
Therefore, its computation and asymptotic theory follow straightforwardly from known results
on current status data. But this simplification comes at a certain cost. For example,
$\tilde F_{n+}$ need not be bounded by one, and the naive estimator has been empirically shown
to be less efficient than the maximum likelihood estimator in the smooth model (Groeneboom et
al., 2008c).
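
Because each component of the naive estimator solves a univariate current status problem, it can be computed as a weighted isotonic regression of the observed proportions on the distinct observation times. The sketch below uses the pool adjacent violators idea for that purpose. It is an illustrative implementation with hypothetical function names and toy data, not the code used in the paper, and it returns values at the observed times only.

```python
def pava(values, weights):
    """Weighted non-decreasing isotonic regression by pooling adjacent violators."""
    blocks = []  # each block holds [pooled mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

def naive_estimator(data, k):
    """data: list of (c, delta). Returns distinct times and the fitted kth component."""
    totals, events = {}, {}
    for c, delta in data:
        totals[c] = totals.get(c, 0) + 1
        events[c] = events.get(c, 0) + delta[k - 1]
    times = sorted(totals)
    proportions = [events[t] / totals[t] for t in times]
    counts = [totals[t] for t in times]
    return times, pava(proportions, counts)

# Toy data with one cause: raw proportions 0.5, 0.25, 0.75 at times 1, 2, 3.
toy = ([(1, [1, 0])] * 2 + [(1, [0, 1])] * 2
       + [(2, [1, 0])] * 1 + [(2, [0, 1])] * 3
       + [(3, [1, 0])] * 3 + [(3, [0, 1])] * 1)
times, fit = naive_estimator(toy, k=1)  # fit is [0.375, 0.375, 0.75]
```

In the toy data the first two proportions violate monotonicity and are pooled to their weighted mean, while the third is left unchanged.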

The R package MLEcens provides an efficient and stable method to compute the maximum
likelihood estimator. This algorithm first uses the Height Map Algorithm of Maathuis (2005)
to compute the areas to which the maximum likelihood estimator can possibly assign
probability mass, called maximal intersections. Next, it computes the amounts of mass that must
be assigned to the maximal intersections. This involves solving a high-dimensional convex
optimization problem, which is done using the support reduction algorithm of Groeneboom et al.
(2008a). Jewell & Kalbfleisch (2004) describe an alternative algorithm for the computation of
the maximum likelihood estimator, based on the pool adjacent violators algorithm of Ayer et al.
(1955).

The maximum likelihood estimator and the naive estimator are not defined uniquely at all
times. Gentleman & Vandal (2002) defined two types of non-uniqueness for estimators based
on censored data: mixture non-uniqueness and representational non-uniqueness. Mixture
non-uniqueness occurs when the probability masses assigned to the maximal intersections are
non-unique. Representational non-uniqueness refers to the fact that the estimator is indifferent
to the distribution of mass within the maximal intersections. The maximum likelihood estimator
for current status data with competing risks is always mixture unique (Maathuis, 2006, Theorem
2.20), and mixture uniqueness of the naive estimator follows as a special case of this. One can
account for representational non-uniqueness of the estimators by providing a lower bound that
assigns all mass to the right endpoints of the maximal intersections, and an upper bound that
assigns all mass to the left endpoints of the maximal intersections.

2.2. Exact observation times with discrete support

Section 2.1 does not impose any assumptions on the observation time distribution $G$, and
hence is valid for both continuous and discrete observation times. However, the formulas can be
simplified when $G$ is discrete. In this case, let $G(\{s\})$ denote the point mass of $G$ at
$s$, and let $\mathcal{S} = \{s \in \mathbb{R} : G(\{s\}) > 0\}$ denote the support of $G$, where
$\mathcal{S}$ is countable but possibly infinite. Defining

$$N_k(s) = \frac{1}{n} \sum_{i=1}^{n} \Delta^i_k 1\{C_i = s\}, \quad k = 1, \ldots, K+1, \ s \in \mathcal{S},$$

and $N(s) = \sum_{k=1}^{K+1} N_k(s)$, the log likelihood (2) reduces to

$$l_n(F) = \sum_{s \in \mathcal{S}} \sum_{k=1}^{K+1} N_k(s) \log\{F_k(s)\}, \tag{4}$$

and the marginal log likelihood (3) for the naive estimator becomes

$$l_{nk}(F_k) = \sum_{s \in \mathcal{S}} \left[ N_k(s) \log\{F_k(s)\} + \{N(s) - N_k(s)\} \log\{1 - F_k(s)\} \right].$$

The spaces $\mathcal{F}_K$ and $\mathcal{F}_1$ can also be simplified, as the nonnegativity,
monotonicity and boundedness constraints only need to hold at points $s \in \mathcal{S}$.
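
The reduction of the log likelihood to sums over the support can be checked numerically. The following sketch tabulates the proportions $N_k(s)$ and evaluates the reduced log likelihood (4) at a candidate $F$. The function names, the four-record data set and the candidate values are hypothetical and serve only as an illustration.

```python
import math

def cell_proportions(data, K):
    """Return {s: [N_1(s), ..., N_{K+1}(s)]}, N_k(s) = (1/n) sum_i Delta^i_k 1{C_i = s}."""
    n = len(data)
    N = {}
    for c, delta in data:
        row = N.setdefault(c, [0.0] * (K + 1))
        for k in range(K + 1):
            row[k] += delta[k] / n
    return N

def log_likelihood(N, F):
    """Evaluate (4). F maps s to [F_1(s), ..., F_K(s)]; F_{K+1}(s) = 1 - sum_k F_k(s)."""
    total = 0.0
    for s, row in N.items():
        probs = list(F[s]) + [1.0 - sum(F[s])]
        for n_k, p in zip(row, probs):
            if n_k > 0:                 # convention 0 log 0 = 0
                if p <= 0:
                    return float("-inf")
                total += n_k * math.log(p)
    return total

# Hypothetical data with K = 2 causes and support {10, 20}.
toy = [(10, [1, 0, 0]), (10, [0, 0, 1]), (20, [0, 1, 0]), (20, [1, 0, 0])]
N = cell_proportions(toy, K=2)
candidate = {10: [0.5, 0.0], 20: [0.5, 0.5]}   # illustrative candidate values
value = log_likelihood(N, candidate)
```

Only the table of proportions is needed to evaluate the likelihood, so the cost of one evaluation depends on the number of support points rather than on the sample size.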

2.3. Grouped observation times

In many applications, only rounded versions of the observation times are recorded, yielding
grouped observation times. We introduce a new model for this type of data, called the grouped
model. For any interval $I$ on the real line, define $G(I) = \int_{c \in I} dG(c)$. Let
$\mathcal{I}$ be a countable but possibly infinite set of mutually exclusive intervals such that
$G(I) > 0$ for all $I \in \mathcal{I}$. For each $I \in \mathcal{I}$, let $m(I)$ denote a unique
point in the interval, for example its midpoint, and let
$\mathcal{M} = \{m(I) \in \mathbb{R} : I \in \mathcal{I}\}$. For each $m \in \mathcal{M}$, let
$I(m)$ denote the corresponding interval in $\mathcal{I}$.

The observation scheme in the grouped model is as follows. As before, the current status of
a subject is assessed at a single random time $C \in \mathbb{R}$, where $C$ is independent of
$(X, Y)$. The difference is, however, that we no longer observe $C$. Instead, all observation
times falling into interval $I$ are grouped and rounded to $m(I)$. Thus, the observed data are
$(D, \Delta)$, where $D = \sum_{I \in \mathcal{I}} m(I) 1\{C \in I\}$ is the rounded version of
$C$, and $\Delta$ is the indicator vector corresponding to the status of the subject at the
exact time $C$, as defined in (1). We study the maximum likelihood estimator and the naive
estimator based on $n$ i.i.d. observations of $(D, \Delta)$, which we denote by
$(D_i, \Delta^i)$, $i = 1, \ldots, n$.

To derive the likelihood in the grouped model, we compute $\mathrm{pr}(D = d, \Delta = \delta)$
for $d \in \mathcal{M}$ and $\delta \in \{e_1, \ldots, e_{K+1}\}$, where $e_k$ is the unit vector
in $\mathbb{R}^{K+1}$ with a 1 at the $k$th entry. Conditioning on the exact observation time
$C$ yields

$$\begin{aligned}
\mathrm{pr}(D = d, \Delta = \delta)
&= \int \mathrm{pr}(D = d, \Delta = \delta \mid C = c)\, dG(c)
 = \int_{c \in I(d)} \mathrm{pr}(\Delta = \delta \mid C = c)\, dG(c) \\
&= \prod_{k=1}^{K+1} \left[ \int_{c \in I(d)} F_{0k}(c)\, dG(c) \right]^{\delta_k}
 = G\{I(d)\} \prod_{k=1}^{K+1} \left[ H_{0k}\{I(d)\} \right]^{\delta_k},
\end{aligned} \tag{5}$$

where

$$H_{0k}\{I(d)\} = [G\{I(d)\}]^{-1} \int_{c \in I(d)} F_{0k}(c)\, dG(c), \quad k = 1, \ldots, K,$$

and $H_{0,K+1}\{I(d)\} = 1 - H_{0+}\{I(d)\}$ are weighted averages of
$F_{01}, \ldots, F_{0,K+1}$ over $I(d)$ with weights determined by $G$. It is convenient to work
with these weighted averages, as they must obey the same constraints as the cumulative
incidence functions. More precisely, considering $H_{0k}$, $k = 1, \ldots, K$, as functions that
map $m$ to $H_{0k}\{I(m)\}$, the constraints on $F_{01}, \ldots, F_{0K}$ imply that
$H_{01}, \ldots, H_{0K}$ must be non-negative and non-decreasing and satisfy
$H_{0+}\{I(m)\} \le 1$ for all $m \in \mathcal{M}$. Let $\mathcal{H}_K$ denote the space of such
allowable $K$-tuples $(H_1, \ldots, H_K)$.

The term $G\{I(d)\}$ in the right hand side of (5) can be dropped from the likelihood, as it does
not depend on $F$. Hence, a maximum likelihood estimator for $H_0 = (H_{01}, \ldots, H_{0K})$ is
defined as any $\hat H_n \in \mathcal{H}_K$ satisfying
$l_n^{\mathrm{group}}(\hat H_n) = \max_{H \in \mathcal{H}_K} l_n^{\mathrm{group}}(H)$, where

$$l_n^{\mathrm{group}}(H) = \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K+1} \Delta^i_k \log[H_k\{I(D_i)\}]. \tag{6}$$

Expression (6) has the same form as (2), but with $F_k(C_i)$ replaced by the weighted average
$H_k\{I(D_i)\}$. As in the discrete model, (6) can be simplified further:

$$l_n^{\mathrm{group}}(H) = \sum_{I \in \mathcal{I}} \sum_{k=1}^{K+1} M_k(I) \log\{H_k(I)\}, \tag{7}$$

where

$$M_k(I) = \frac{1}{n} \sum_{i=1}^{n} \Delta^i_k 1\{D_i = m(I)\}, \quad k = 1, \ldots, K+1, \ I \in \mathcal{I}.$$


Since the log likelihood (7) has the same form as (4), and also the constraints on the
maximization problems for the discrete and grouped models are equivalent, the maximum likelihood
estimator in the grouped model can be computed with existing software. Moreover, its asymptotic
theory follows straightforwardly from the theory for the discrete model. The important
difference between the two models is, however, that the resulting estimates must be interpreted
differently. In the discrete model, one estimates the cumulative incidence functions at points
$s \in \mathcal{S}$. In the grouped model, the cumulative incidence functions are unidentifiable
in general, and one estimates the weighted averages of the cumulative incidence functions over
intervals $I \in \mathcal{I}$.

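The naive estimator $\tilde H_n$ in the grouped model can be derived analogously. Defining
$M(I) = \sum_{k=1}^{K+1} M_k(I)$, $I \in \mathcal{I}$, the marginal log likelihood for the $k$th
component is

$$l_{nk}^{\mathrm{group}}(H_k) = \sum_{I \in \mathcal{I}} \left[ M_k(I) \log\{H_k(I)\} + \{M(I) - M_k(I)\} \log\{1 - H_k(I)\} \right], \tag{8}$$

and $\tilde H_{nk} \in \mathcal{H}_1$ is defined by
$l_{nk}^{\mathrm{group}}(\tilde H_{nk}) = \max_{H_k \in \mathcal{H}_1} l_{nk}^{\mathrm{group}}(H_k)$.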

Remark 1. In general, $F_{0k}(m) \ne H_{0k}\{I(m)\}$, but equality can occur in special
situations. For example, $F_{0k}(m) = H_{0k}\{I(m)\}$ if (i) $F_{0k}$ is constant on $I(m)$, or
(ii) both $F_{0k}$ and $G$ are linear on $I(m)$ and $m$ is the midpoint of $I(m)$, or (iii) the
only mass of $G$ on $I(m)$ consists of a point mass at $m$. The latter shows that the grouped
model generalizes the discrete model.
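
The gap between the value of a cumulative incidence function at the recorded midpoint and its weighted average over the grouping interval can be illustrated numerically. The sketch below uses the grouping intervals of the menopause example, and it assumes, purely for illustration, that $G$ is uniform on each interval. The two functions passed to it are hypothetical and not estimated from any data.

```python
# Grouping intervals (a, b] from the menopause example: two 5-year cells, then 1-year cells.
intervals = [(25, 30), (30, 35)] + [(a, a + 1) for a in range(35, 59)]

def round_to_midpoint(c):
    """Map an exact observation time c to m(I) for the interval I = (a, b] containing it."""
    for a, b in intervals:
        if a < c <= b:
            return (a + b) / 2
    raise ValueError("observation time outside all grouping intervals")

def weighted_average(F, a, b, steps=10000):
    """H(I) = G(I)^{-1} * integral of F dG over I, ASSUMING G is uniform on I = (a, b].

    The integral is approximated with a midpoint Riemann sum.
    """
    h = (b - a) / steps
    return sum(F(a + (j + 0.5) * h) for j in range(steps)) / steps

recorded = round_to_midpoint(30.7)              # 32.5, as in the example of Section 1

linear_F = lambda t: t / 100.0                  # hypothetical, linear on the interval
convex_F = lambda t: (t / 60.0) ** 2            # hypothetical, strictly convex

H_linear = weighted_average(linear_F, 30, 35)   # agrees with linear_F(32.5), case (ii)
H_convex = weighted_average(convex_F, 30, 35)   # exceeds convex_F(32.5)
```

For the linear function the weighted average coincides with the value at the midpoint, in line with case (ii) of Remark 1, whereas for the convex function the two quantities differ.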


3. Local asymptotics of the estimators

3.1. Strong consistency in the discrete and grouped models

The maximum likelihood estimator and the naive estimator are Hellinger consistent when the
observation times are recorded exactly, for any observation time distribution $G$ (Maathuis,
2006, Theorem 4.6). Using the equivalence between Hellinger distance and total variation
distance, this implies consistency in total variation (Maathuis, 2006, Corollary 4.7), which in
turn implies strong pointwise consistency at all points $s \in \mathcal{S}$ in the discrete
model, as stated in Theorem 1. Here and below, $\hat F_n$ and $\hat H_n$ denote maximum
likelihood estimators and $\tilde F_n$ and $\tilde H_n$ denote naive estimators.

Theorem 1. (Maathuis, 2006, Corollary 4.9) In the discrete model, $\hat F_{nk}(s) \to F_{0k}(s)$
and $\tilde F_{nk}(s) \to F_{0k}(s)$ almost surely as $n \to \infty$ for all $s \in \mathcal{S}$.

Since the form of the log likelihood and the constraints on the allowable functions are identical
in the discrete and grouped models, the proofs for the discrete model carry over directly to the
grouped model. This leads to Theorem 2, which we give without proof.

Theorem 2. In the grouped model, $\hat H_{nk}(I) \to H_{0k}(I)$ and
$\tilde H_{nk}(I) \to H_{0k}(I)$ almost surely as $n \to \infty$ for all $I \in \mathcal{I}$.


3.2. Limiting distributions in the discrete model

Denote the infimum and supremum of $\mathcal{S}$ by $s_{\inf}$ and $s_{\sup}$, and define

$$s_- = \sup\{x \in \mathcal{S} : x < s\} \ \text{for } s \in \mathcal{S},\ s \ne s_{\inf}, \qquad
s_+ = \inf\{x \in \mathcal{S} : x > s\} \ \text{for } s \in \mathcal{S},\ s \ne s_{\sup}.$$

Define $s \in \mathcal{S}$ to be a regular point if $F_{0k}(s) = 0$ for all $k = 1, \ldots, K$,
or the following two conditions hold: (i) if $s \ne s_{\inf}$ then $s_- \in \mathcal{S}$ and for
each $k = 1, \ldots, K$ either $F_{0k}(s_-) < F_{0k}(s)$ or $F_{0k}(s) = 0$, and (ii) if
$s \ne s_{\sup}$ then $s_+ \in \mathcal{S}$ and for each $k = 1, \ldots, K$ either
$F_{0k}(s) < F_{0k}(s_+)$ or $F_{0k}(s) = 0$. If $\mathcal{S}$ is a finite set and
$s \in \mathcal{S} \setminus \{s_{\inf}, s_{\sup}\}$, then $s_-$ and $s_+$ are simply the points
directly to the left and right of $s$, and conditions (i) and (ii) are equivalent to requiring
that for each $k = 1, \ldots, K$ either $F_{0k}(s_-) < F_{0k}(s) < F_{0k}(s_+)$ or
$F_{0k}(s) = 0$. As a second example, suppose that $\mathcal{S}$ is the set of rational numbers.
Then for any point $s \in \mathcal{S}$ we have $s \notin \{s_{\inf}, s_{\sup}\}$ and
$s_- = s = s_+$. Hence, conditions (i) and (ii) are only satisfied if $F_{0k}(s) = 0$ for all
$k = 1, \ldots, K$. Yu et al. (1998) introduced regular points in the current status model
without competing risks. Our definition generalizes theirs by allowing for competing risks.
Moreover, we allow the parameters to be on the boundary of the parameter space. For example,
$s \in \mathcal{S}$ can be a regular point when $F_{0k}(s) = 0$ for some or all of the
$F_{0k}$'s, and $s = s_{\sup}$ can be a regular point when $\sum_{k=1}^{K} F_{0k}(s) = 1$ or when
$F_{0k}(s) = \lim_{t \to \infty} F_{0k}(t)$ for some of the $F_{0k}$'s.

We now introduce the following simple estimator for $F_{0k}(s)$:

$$\bar F_{nk}(s) = N_k(s)/N(s), \quad k = 1, \ldots, K, \ s \in \mathcal{S},$$

where we set $0/0 = 0$. This estimator is very simple, in the sense that $\bar F_{nk}$ does not
obey monotonicity constraints and uses only the $k$th component of the $\Delta$-vector. Lemma 1
below states that $\bar F_n$ is the maximum likelihood estimator for $F_0$ if the monotonicity
constraints on the cumulative incidence functions are discarded. Next, Lemma 2 establishes that
for any regular point $s \in \mathcal{S}$, the maximum likelihood estimator $\hat F_n(s)$, the
naive estimator $\tilde F_n(s)$ and the simple estimator $\bar F_n(s)$ coincide with probability
tending to one as $n \to \infty$. Hence, at such points the limiting distributions of
$\hat F_n(s)$ and $\tilde F_n(s)$ equal the limiting distribution of $\bar F_n(s)$. This yields
asymptotic normality of $\hat F_n(s)$ and $\tilde F_n(s)$ at regular points, as stated in
Theorem 3. All proofs are deferred to Section 7.

Lemma 1. Let $\mathcal{F}_K^{*} = \{F = (F_1, \ldots, F_K) : F_k(s) \ge 0$ for
$k = 1, \ldots, K$ and $F_+(s) \le 1$ for all $s \in \mathcal{S}\}$. Then
$l_n(\bar F_n) \ge l_n(F)$ for all $F \in \mathcal{F}_K^{*}$, and $l_n(\bar F_n) > l_n(F)$ for
all $F \in \mathcal{F}_K^{*}$ such that $F(s) \ne \bar F_n(s)$ for some $s \in \mathcal{S}$ with
$N(s) > 0$.

Lemma 2. For any regular point $s \in \mathcal{S}$ in the discrete model,
$\mathrm{pr}\{\hat F_n(s) = \tilde F_n(s) = \bar F_n(s)\} \to 1$ as $n \to \infty$.

Theorem 3. For any regular point $s \in \mathcal{S}$ in the discrete model,

$$n^{1/2}\{\hat F_n(s) - F_0(s)\} = n^{1/2} \begin{pmatrix} \hat F_{n1}(s) - F_{01}(s) \\ \vdots \\ \hat F_{nK}(s) - F_{0K}(s) \end{pmatrix}$$

is asymptotically normal with mean zero and covariance matrix $V(s)$, where $V(s)$ is a
$K \times K$ matrix with entries

$$\{V(s)\}_{k,\ell} = \left[ F_{0k}(s) 1\{k = \ell\} - F_{0k}(s) F_{0\ell}(s) \right] / G(\{s\}), \quad k, \ell \in \{1, \ldots, K\}.$$

Moreover, for any finite collection of regular points $s_1, \ldots, s_p \in \mathcal{S}$, the
stacked vector $n^{1/2}\{\hat F_n(s_1) - F_0(s_1), \ldots, \hat F_n(s_p) - F_0(s_p)\}$ is
asymptotically normal with mean zero and block diagonal covariance matrix with blocks
$V(s_1), \ldots, V(s_p)$. Consistent estimators for the elements of $V(s)$, $s \in \mathcal{S}$,
are

$$\{\hat V_n(s)\}_{k,\ell} = \left[ \hat F_{nk}(s) 1\{k = \ell\} - \hat F_{nk}(s) \hat F_{n\ell}(s) \right] / N(s), \quad k, \ell \in \{1, \ldots, K\}.$$

The same results hold for the naive estimator, that is, when $\hat F_n$ is replaced by
$\tilde F_n$.
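
The covariance estimator of Theorem 3 is straightforward to compute from the observed proportions at a single support point. The sketch below plugs in the simple estimator $N_k(s)/N(s)$, which is an assumption made here for brevity; it is justified only at regular points, where the three estimators coincide with probability tending to one. The function name and the toy data are hypothetical.

```python
def simple_estimator_and_cov(data, s, K):
    """Return the simple estimator at s and the plug-in covariance matrix estimate.

    data: list of (c, delta), delta of length K + 1.
    The (k, l) entry is [F_k 1{k = l} - F_k F_l] / N(s), with N(s) the fraction
    of observation times equal to s.
    """
    n = len(data)
    at_s = [delta for c, delta in data if c == s]
    if not at_s:
        raise ValueError("no observations at s")
    N_s = len(at_s) / n
    F_bar = [sum(d[k] for d in at_s) / len(at_s) for k in range(K)]
    V = [[(F_bar[k] * (k == l) - F_bar[k] * F_bar[l]) / N_s for l in range(K)]
         for k in range(K)]
    return F_bar, V

# Hypothetical data: n = 20, half of the observation times equal 10.
toy = ([(10, [1, 0, 0])] * 3 + [(10, [0, 1, 0])] * 2 + [(10, [0, 0, 1])] * 5
       + [(20, [0, 0, 1])] * 10)
F_bar, V = simple_estimator_and_cov(toy, s=10, K=2)
```

The diagonal entries are binomial variances inflated by the inverse of the fraction of observations at $s$, and the off-diagonal entries are negative because the causes compete for the same events.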



Remark 2. If $F_{0k}(s) > 0$ for all $k = 1, \ldots, K$ and $\sum_{k=1}^{K} F_{0k}(s) = 1$, then
the matrix $V(s)$ is positive-semidefinite with rank $K - 1$. If $F_{0k}(s) = 0$ or
$F_{0k}(s) = 1$, then the $k$th row and the $k$th column of $V(s)$ are zero vectors, and the
corresponding limiting distributions of $\hat F_{nk}(s)$ and $\tilde F_{nk}(s)$ should be
interpreted as degenerate distributions consisting of a point mass at zero. More details can be
found in the proof of Theorem 3.

3.3. Limiting distributions in the grouped model

Denote the infimum and supremum of $\mathcal{M}$ by $m_{\inf}$ and $m_{\sup}$, and define

$$\{m(I)\}_- = \sup\{x \in \mathcal{M} : x < m(I)\} \ \text{for } I \in \mathcal{I} \text{ with } m(I) \ne m_{\inf},$$
$$\{m(I)\}_+ = \inf\{x \in \mathcal{M} : x > m(I)\} \ \text{for } I \in \mathcal{I} \text{ with } m(I) \ne m_{\sup}.$$

If $\{m(I)\}_- \in \mathcal{M}$ let $I_- = I[\{m(I)\}_-]$, and if $\{m(I)\}_+ \in \mathcal{M}$ let
$I_+ = I[\{m(I)\}_+]$. We say that $I \in \mathcal{I}$ is a regular interval if $H_{0k}(I) = 0$
for all $k = 1, \ldots, K$, or the following two conditions hold: (i) if $m(I) \ne m_{\inf}$ then
$\{m(I)\}_- \in \mathcal{M}$ and for each $k = 1, \ldots, K$ either $H_{0k}(I_-) < H_{0k}(I)$ or
$H_{0k}(I) = 0$, and (ii) if $m(I) \ne m_{\sup}$ then $\{m(I)\}_+ \in \mathcal{M}$ and for each
$k = 1, \ldots, K$ either $H_{0k}(I) < H_{0k}(I_+)$ or $H_{0k}(I) = 0$.

Analogously to the simple estimator $\bar F_n$ in the discrete model, we define a simple
estimator in the grouped model:

$$\bar H_{nk}(I) = M_k(I)/M(I), \quad k = 1, \ldots, K, \ I \in \mathcal{I}.$$

The proofs and results for the discrete model can now be translated directly to the grouped
model, by replacing regular points $s \in \mathcal{S}$ by regular intervals
$I \in \mathcal{I}$, $\hat F_n(s)$ by $\hat H_n(I)$, $\tilde F_n(s)$ by $\tilde H_n(I)$,
$\bar F_n(s)$ by $\bar H_n(I)$, $F_0(s)$ by $H_0(I)$, and $N_k(s)$ by $M_k(I)$ for
$k = 1, \ldots, K + 1$. We therefore only give the main result in Theorem 4, without proof.

Theorem 4. For any regular interval $I \in \mathcal{I}$ in the grouped model,

$$n^{1/2}\{\hat H_n(I) - H_0(I)\} = n^{1/2} \begin{pmatrix} \hat H_{n1}(I) - H_{01}(I) \\ \vdots \\ \hat H_{nK}(I) - H_{0K}(I) \end{pmatrix}$$

is asymptotically normal with mean zero and covariance matrix $U(I)$, where $U(I)$ is a
$K \times K$ matrix with entries

$$\{U(I)\}_{k,\ell} = \left[ H_{0k}(I) 1\{k = \ell\} - H_{0k}(I) H_{0\ell}(I) \right] / G(I), \quad k, \ell \in \{1, \ldots, K\}.$$

Moreover, for any finite collection of regular intervals $I_1, \ldots, I_p$, the stacked vector
$n^{1/2}\{\hat H_n(I_1) - H_0(I_1), \ldots, \hat H_n(I_p) - H_0(I_p)\}$ is asymptotically normal
with mean vector zero and block diagonal covariance matrix with blocks
$U(I_1), \ldots, U(I_p)$. Consistent estimators for the elements of $U(I)$,
$I \in \mathcal{I}$, are

$$\{\hat U_n(I)\}_{k,\ell} = \left[ \hat H_{nk}(I) 1\{k = \ell\} - \hat H_{nk}(I) \hat H_{n\ell}(I) \right] / M(I), \quad k, \ell \in \{1, \ldots, K\}.$$

The same results hold for the naive estimator, that is, when $\hat H_n$ is replaced by
$\tilde H_n$.

As in Theorem 3, a degenerate limiting distribution should be interpreted as point mass at zero.


3.4. Theoretical motivation for the grouped model

The asymptotic results provide a theoretical motivation for the grouped model, since a
contradiction arises with respect to rates of convergence when the grouping of observation times
is ignored. To see this, consider the menopause data and the HIV data, and suppose that the
grouping of observation times is ignored, meaning that the recorded observation times are
interpreted as exact observation times. This assumption was made in previous analyses of the
menopause data (see Jewell & Kalbfleisch (2004); Jewell et al. (2003); Krailo & Pike (1983);
Maathuis (2006)). Under this assumption, the discrete model is most appropriate for the
menopause data, since there are numerous ties in the recorded observation times (see Section
5.2). On the other hand, the smooth model seems most appropriate for the HIV data, since this
data set contains very few ties in the recorded observation times (see Section 5.3). This would
imply that the local rate of convergence of the maximum likelihood estimator and the naive
estimator at the recorded observation times is $n^{1/2}$ for the menopause data, while it is
$n^{1/3}$ for the HIV data.

In reality, however, the observation times were continuous in both data sets, and they were
rounded in the recording process. In the menopause data, this rounding was substantial, into
1-year or 5-year intervals, while in the HIV data it was minimal, into 1-day intervals. Since
rounding implies discarding information, it seems impossible that more rounding, as in the
menopause data, leads to a faster local rate of convergence at the recorded observation times.
This apparent contradiction can be resolved by modeling the grouping of the observation times.
In the grouped model, rounding or grouping of the observation times indeed yields a faster rate
of convergence, not for the cumulative incidence functions at the recorded observation times,
but for weighted averages of the cumulative incidence functions over the grid cells. These
weighted averages are smooth functionals of the cumulative incidence functions and thus can be
estimated at rate $n^{1/2}$ (Jewell et al. (2003), Maathuis (2006, Chapter 7)).


4. Construction of pointwise confidence intervals

4.1. Confidence intervals in the discrete and grouped models

In the discrete and grouped models, the large-sample behavior of the maximum likelihood
estimator and the naive estimator at regular points or intervals is standard, and hence
confidence intervals can be constructed by any standard method, for example using the asymptotic
normal distribution or the bootstrap. For instance, let $s \in \mathcal{S}$ be a regular point
in the discrete model. Then an asymptotic $(1 - \alpha)100\%$ confidence interval for
$F_{0k}(s)$ is

$$\hat F_{nk}(s) \pm n^{-1/2} z_{1-\alpha/2} \left[ \{\hat V_n(s)\}_{k,k} \right]^{1/2},$$

where $z_{1-\alpha/2}$ is the $(1 - \alpha/2)$-quantile of the standard normal distribution.
Similarly, considering a regular interval $I \in \mathcal{I}$ in the grouped model, an
asymptotic $(1 - \alpha)100\%$ confidence interval for $H_{0k}(I)$ is

$$\hat H_{nk}(I) \pm n^{-1/2} z_{1-\alpha/2} \left[ \{\hat U_n(I)\}_{k,k} \right]^{1/2}. \tag{9}$$
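
The normal-approximation intervals above require only a point estimate, the corresponding diagonal element of the estimated covariance matrix, and the sample size. A minimal sketch follows; the function name and the numerical inputs are hypothetical, and the interval is not truncated to $[0, 1]$, in keeping with the displayed formulas.

```python
from statistics import NormalDist

def wald_interval(estimate, variance, n, alpha=0.05):
    """Asymptotic (1 - alpha) interval: estimate +/- n^{-1/2} z_{1-alpha/2} variance^{1/2}.

    variance is the (k, k) element of the estimated limiting covariance matrix,
    that is, the variance of n^{1/2} (estimator - truth), not of the estimator itself.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * (variance / n) ** 0.5
    return estimate - half_width, estimate + half_width

# Illustrative inputs: estimate 0.3 with estimated limiting variance 0.42 and n = 20.
lower, upper = wald_interval(0.3, 0.42, n=20)
```

With a sample this small the interval is wide, which reflects that only the observations at the given support point or grouping interval carry information about the target.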

4.2. Confidence intervals in the smooth model

In the smooth model, the large-sample behavior of the maximum likelihood estimator and
the naive estimator is nonstandard, making the construction of confidence intervals less
straightforward. In principle, one can construct confidence intervals using the limiting
distribution of the maximum likelihood estimator, but this approach entails several
difficulties. First, the limiting distribution involves parameters from the underlying
distributions that need to be estimated. Moreover, Theorems 1.7 and 1.8 of Groeneboom et al.
(2008c) suggest that these parameters cannot be separated from the limiting distribution, in the
sense that it seems impossible to write the limiting distribution as $cZ$, where $c$ is a
constant depending on the underlying distribution and $Z$ is a universal limit. Hence, one would
need to simulate the limiting distribution on a case by case basis. Conducting such simulations
is non-trivial (Groeneboom & Wellner (2001)).

One might also consider the nonparametric bootstrap to construct confidence intervals based
on the maximum likelihood estimator or the naive estimator. However, it is likely that the
bootstrap is inconsistent in this setting, given recent results of Kosorok (2008) and Sen et al.
(2010) on inconsistency of the bootstrap for the closely related Grenander estimator.

Subsampling (Politis & Romano, 1994), a variant of the bootstrap, produces asymptotically
valid confidence intervals under very minimal assumptions, and can be applied to construct
asymptotically valid confidence intervals for the cumulative incidence functions based on the
maximum likelihood estimator or the naive estimator. A drawback of subsampling is that it
requires a tuning parameter, the subsample size, which is difficult to choose in practice.
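
The mechanics of subsampling can be sketched generically. The code below is an illustration under stated assumptions and not the procedure used in the paper: `estimator` is any user-supplied function returning a scalar, `rate_exp` is the assumed exponent of the rate of convergence (1/3 for the smooth model), and the subsample size `b` is the tuning parameter mentioned above. The demonstration at the end uses a sample mean with exponent 1/2 only because it is self-contained.

```python
import random

def subsampling_interval(data, estimator, b, rate_exp=1/3, alpha=0.05, reps=500, seed=0):
    """Subsampling confidence interval for a scalar parameter.

    Approximates the law of n^rate_exp * (theta_hat_n - theta) by the empirical law of
    b^rate_exp * (theta_hat_b - theta_hat_n) over subsamples of size b drawn
    without replacement, then inverts it.
    """
    rng = random.Random(seed)
    n = len(data)
    theta_n = estimator(data)
    roots = sorted(
        b ** rate_exp * (estimator(rng.sample(data, b)) - theta_n)
        for _ in range(reps)
    )
    lo_q = roots[int((alpha / 2) * (reps - 1))]
    hi_q = roots[int((1 - alpha / 2) * (reps - 1))]
    scale = n ** rate_exp
    return theta_n - hi_q / scale, theta_n - lo_q / scale

# Self-contained demonstration with a sample mean (rate exponent 1/2).
demo_rng = random.Random(1)
xs = [demo_rng.gauss(0.0, 1.0) for _ in range(200)]
mean = lambda v: sum(v) / len(v)
lo, hi = subsampling_interval(xs, mean, b=30, rate_exp=0.5)
```

For the estimators studied here, `estimator` would evaluate the maximum likelihood or naive estimator at a fixed time point, and the resulting interval can be sensitive to the choice of `b`.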

Finally, one can consider likelihood ratio confidence intervals based on the naive estimator. Although the naive estimator has been shown empirically to be less efficient than the maximum likelihood estimator (Groeneboom et al., 2008c, Figure 3), it has the advantage that its large sample behavior is simpler. For a fixed failure cause, the limiting distribution of the naive estimator is identical to the limiting distribution of the maximum likelihood estimator for current status data without competing risks (Groeneboom et al., 2008c, Theorem 1·6). Hence, the likelihood ratio theory of Banerjee & Wellner (2001) applies, and confidence intervals can be constructed by inverting likelihood ratio tests (Banerjee & Wellner, 2005). These confidence intervals have the appealing property that they do not require estimation of parameters from the underlying distribution, nor any tuning parameters. Simulation studies by Banerjee & Wellner (2005) showed that for current status data without competing risks, likelihood ratio based confidence intervals are typically preferable to confidence intervals based on the limiting distribution or subsampling. In the smooth model, we therefore recommend using likelihood ratio confidence intervals based on the naive estimator.

5. Examples 

5·1. Simulation

It is not clear how well the asymptotic distributions of Sections 3·2 and 3·3 approximate the finite sample behavior of the estimators, especially for grids that are dense relative to $n$. We therefore conducted a simulation study, using the following discrete model: pr(Y = 1) = 0·6, pr(Y = 2) = 0·4, X | Y = 1 ~ Gamma(5, 3), and X | Y = 2 ~ Gamma(9, 2), where Gamma(a, b) denotes a Gamma distribution with shape parameter a and scale parameter b. The distribution of $C$ was taken to be uniform on one of the following grids: (i) {10, 20, 30}, called "gap 10", (ii) {6, 8, ..., 34}, called "gap 2", (iii) {5·5, 6·0, ..., 35·0}, called "gap 0·5", and (iv) {5·1, 5·2, ..., 35·0}, called "gap 0·1". For each of the four resulting models, 1000 data sets of sample size n = 1000 were simulated. For each data set we computed symmetric 95% asymptotic confidence intervals for the cumulative incidence functions at the points $t_0 \in \{10, 20, 30\}$, based on the normal distribution and the bootstrap, using both the maximum likelihood estimator and the naive estimator.
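
The data-generating mechanism described above can be sketched in Python as follows. This is only an illustration using the standard library: the function names, the coding of the observed status as 1, 2 or 3 (with 3 meaning that no failure has occurred by time c), and the seed are assumptions made here. The simple estimator is computed as the fraction of subjects observed at a grid point that have failed from the given cause.

```python
import random

def simulate(n, grid, rng):
    """Draw n pairs (c, status); status is the failure cause if X <= c, else 3."""
    data = []
    for _ in range(n):
        y = 1 if rng.random() < 0.6 else 2
        # random.gammavariate(alpha, beta) takes shape alpha and scale beta.
        x = rng.gammavariate(5, 3) if y == 1 else rng.gammavariate(9, 2)
        c = rng.choice(grid)
        data.append((c, y if x <= c else 3))
    return data

def simple_estimator(data, s, k):
    """Fraction of subjects observed at s that failed from cause k (NaN if none observed)."""
    at_s = [status for c, status in data if c == s]
    return sum(status == k for status in at_s) / len(at_s) if at_s else float("nan")

grids = {
    "gap 10": [10, 20, 30],
    "gap 2": list(range(6, 35, 2)),
    "gap 0.5": [5.5 + 0.5 * j for j in range(60)],
    "gap 0.1": [round(5.1 + 0.1 * j, 1) for j in range(300)],
}

rng = random.Random(2011)
for name, grid in grids.items():
    data = simulate(1000, grid, rng)
    print(name, [simple_estimator(data, s, 1) for s in (10, 20, 30)])
```

For the densest grid a single data set may contain no observation at a given grid point, in which case the sketch returns NaN for that point.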

The results for $F_{01}$ are shown in Figure 1. The results for $F_{02}$ are similar, and are therefore omitted. Confidence intervals based on the maximum likelihood estimator behave very similarly to confidence intervals based on the naive estimator, while there is a large difference between normal and bootstrap based confidence intervals for the denser grids. The increase in width of the normal based confidence intervals for the denser grids is caused by the decrease of $nG(\{t_0\})$, which can be viewed as the expected effective sample size for the simple estimator $\bar F_n$ at $t_0$. As a result, the variance of the asymptotic normal distribution increases by a factor 5 or 6 between each pair of successive grids. The empirical variance of the estimators, on the other hand, increases somewhat for the denser grids, but not by much, due to the stabilizing effect of the monotonization that takes place in the maximum likelihood estimator and the naive estimator. As a result, the normal based confidence intervals give substantial over-coverage. This breakdown of the normal limit is already apparent for the larger time points in the relatively coarse grid "gap 2", which has an average of 67 observation times per grid point. The bootstrap variance was found to be a better approximation of the empirical variance of the estimators, suggesting the use of bootstrap intervals over asymptotic normal intervals in practice. However, the undercoverage of the bootstrap intervals at $t_0 = 10$ becomes more substantial as the grids become denser. This points to inconsistency of the bootstrap for very dense grids, which is in line with the theory discussed in Section 4·2.
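
To make the expected effective sample sizes concrete: because $C$ is uniform on each grid, $nG(\{t_0\}) = n/m$, where $m$ is the number of grid points. The values of $m$ below are counted from the four grids in the simulation design and are not stated explicitly in the text.

```latex
\[
nG(\{t_0\}) = \frac{n}{m}: \qquad
\frac{1000}{3} \approx 333 \ (\text{gap } 10), \quad
\frac{1000}{15} \approx 67 \ (\text{gap } 2), \quad
\frac{1000}{60} \approx 17 \ (\text{gap } 0{\cdot}5), \quad
\frac{1000}{300} \approx 3{\cdot}3 \ (\text{gap } 0{\cdot}1).
\]
```

On the densest grid only about three observations are expected per grid point, so a normal approximation applied point by point has very little data to work with.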

Fig. 1. Simulation: Coverage and average width of the four 95% confidence intervals for $F_{01}(t_0)$ as a function of $t_0$. The confidence intervals were based on the normal distribution (o) and the bootstrap (a), using the maximum likelihood estimator (solid line) and the naive estimator (dashed line). The bootstrap confidence intervals are based on 750 bootstrap samples.



5·2. Menopause data

We consider data on 2423 women in the age range 25-59 years from Cycle I of the Health Examination Survey of the National Center for Health Statistics (MacMahon & Worcestor, 1966). Among other things, these women were asked to report (i) their current age, (ii) whether they were pre- or postmenopausal, and (iii) if they were postmenopausal, the age and cause of menopause, where the cause could be natural or operative. Since MacMahon & Worcestor (1966) found marked terminal digit clustering in the reported ages of menopause, Krailo & Pike (1983) excluded these from the analysis. The remaining information can be viewed as current status data with competing risks. Nonparametric estimates of the cumulative incidences of the two types of menopause were computed by Jewell et al. (2003), Jewell & Kalbfleisch (2004) and Maathuis (2006) under the assumption that the recorded ages of the women at the time of the interview were exact. However, this was not the case. Instead, the ages were grouped into the intervals (25, 30], (30, 35], (35, 36], (36, 37], ..., (58, 59] and recorded as the midpoints of these intervals, yielding 26 age groups with a minimum of 45 and an average of 93 observations per age group. This is comparable to "gap 2" in our simulation study (see Section 5·1).
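
The grouping scheme can be written down explicitly. The following Python sketch maps an exact age to the midpoint of its grouping interval and checks the number of groups and the average group size; the names `edges` and `group_midpoint` are hypothetical, and exact ages are assumed to be real numbers in (25, 59].

```python
import bisect

# Endpoints of the intervals (25,30], (30,35], (35,36], (36,37], ..., (58,59].
edges = [25, 30, 35] + list(range(36, 60))

def group_midpoint(age):
    """Midpoint of the interval (a, b] that contains age."""
    if not edges[0] < age <= edges[-1]:
        raise ValueError("age outside (25, 59]")
    j = bisect.bisect_left(edges, age)  # smallest j with edges[j] >= age
    return (edges[j - 1] + edges[j]) / 2

print(len(edges) - 1)        # number of age groups
print(round(2423 / 26))      # average number of observations per group
print(group_midpoint(32.4), group_midpoint(47.0))
```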

We add to the previous analyses of these data in two ways: we use the grouped model, which is clearly appropriate for these data, and we provide confidence intervals. Figure 2 shows the maximum likelihood estimator and the naive estimator for the weighted averages of the cumulative incidence functions, together with 95% normal and bootstrap confidence intervals based on the maximum likelihood estimator. As in our simulation study, the confidence intervals based on the normal distribution are wider than those based on the bootstrap.

5·3. HIV data

The Bangkok Metropolitan Administration injecting drug users cohort study (Kitayaporn et al., 1998; Vanichseni et al., 2001) was established in 1995 to better understand HIV transmission and to assess the feasibility of conducting a phase III HIV vaccine efficacy trial in an injecting drug users population in Bangkok. We consider data on 1366 injecting drug users in this study who were screened from May to December 1996 and who were under 35 years of age. Among this group, 393 were HIV positive, with 114 infected with subtype B, 238 infected with subtype E, 5 infected by another or mixed subtype, and 36 infected with missing subtype. The subjects with other, mixed, or missing subtypes were grouped in the category "other". All ages were recorded in days, leading to a small number of ties: among the 1366 subjects, there were 1212 distinct ages, and the mean number of observations per distinct age was 1·13. In light of this, we analyze these data using the smooth model. Figure 3 shows the maximum likelihood estimator and the naive estimator for the subtype-specific cumulative incidence of HIV, together with 95% likelihood ratio confidence intervals based on the naive estimator.

Fig. 2. Menopause data: The maximum likelihood estimator $\hat H_n$ (o) and the naive estimator $\tilde H_n$ (×) for the weighted averages of the cumulative incidence of operative and natural menopause over the age groups. The estimators are plotted at the midpoints of the age groups which are indicated by the dotted vertical lines. The two solid vertical line segments in each age group are 95% asymptotic confidence intervals based on the maximum likelihood estimator: the left line segment is based on the normal approximation (9) and the right line segment is a symmetric bootstrap confidence interval based on 1000 bootstrap samples.

Fig. 3. HIV data: The maximum likelihood estimator $\hat F_n$ (dashed) and the naive estimator $\tilde F_n$ (solid) for the cumulative incidence of HIV subtypes B and E as a function of age, using the smooth model. The solid vertical lines represent 95% pointwise confidence intervals at times 16, ..., 34, based on the likelihood ratio method for the naive estimator.



6. Observation time distribution or grouping dependent on n 

There are interesting connections between our work and unpublished work of Tang, Banerjee and Kosorok (see http://www.stat.lsa.umich.edu/~moulib/jsm09csd.pdf), who studied current status data without competing risks when the observation time distribution depends on the sample size $n$. More precisely, let $X$ be a random event time with distribution $F_0$ and let $C^{(n)}$ be a random observation time with distribution $G^{(n)}$, where $G^{(n)}$ is a discrete distribution on an equidistant grid with spacings $n^{-\gamma}$ for some $\gamma \in (0, 1)$. Without loss of generality, assume this grid is on $[0, 1]$. Consider the nonparametric maximum likelihood estimator $\hat F_n$ for $F_0$ based on $n$ independent and identically distributed observations of $(C^{(n)}, \Delta^{(n)})$, where $\Delta^{(n)} = 1\{X \le C^{(n)}\}$. Let $t_0 \in (0, 1)$ be a time point of interest, and let $t_n$ be the largest support point of $G^{(n)}$ smaller than $t_0$. Assuming $F_0$ satisfies certain smoothness conditions in a neighborhood of $t_0$, Tang et al. found that the limiting distribution of the maximum likelihood estimator depends crucially on $\gamma$. For $\gamma < 1/3$ the limiting distribution of $n^{(1-\gamma)/2}\{\hat F_n(t_n) - F_0(t_n)\}$ is normal with mean zero and variance $F_0(t_0)\{1 - F_0(t_0)\}$. Hence, for such sparse grids, the maximum likelihood estimator behaves as in the discrete model, up to a different rate of convergence. For $\gamma > 1/3$, on the other hand, the limiting distribution of $n^{1/3}\{\hat F_n(t_0) - F_0(t_0)\}$ is determined by the slope of the convex minorant of a Brownian motion process plus parabolic drift, showing that the maximum likelihood estimator behaves as in the smooth model. The case $\gamma = 1/3$ forms the boundary between these two scenarios and yields a new limiting distribution.
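
The three regimes can be summarized in a small lookup. The following Python sketch is only a restatement of the paragraph above; the function name `regime` and the descriptive strings are assumptions made here, and no rate is returned for the boundary case because the text only says that a new limiting distribution appears there.

```python
from fractions import Fraction

def regime(gamma):
    """Rate exponent and type of limit for grid spacing n**(-gamma), 0 < gamma < 1."""
    gamma = Fraction(gamma)
    if not 0 < gamma < 1:
        raise ValueError("gamma must lie in (0, 1)")
    third = Fraction(1, 3)
    if gamma < third:
        return (1 - gamma) / 2, "normal limit, as in the discrete model"
    if gamma > third:
        return third, "convex-minorant limit, as in the smooth model"
    return None, "boundary case with a new limiting distribution"

for g in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    print(g, regime(g))
```

Exact fractions are used so that the boundary value 1/3 is recognized reliably. Note that (1 - γ)/2 equals 1/3 at γ = 1/3, so the two rate expressions meet at the boundary.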

Combining our work with that of Tang et al. yields two extensions. First, consider a grouped model for current status data without competing risks, where the grouping intervals depend on $n$. More precisely, let $X$ be an event time with distribution $F_0$, let $C$ be an observation time with distribution $G$, and let $\Delta = 1\{X \le C\}$. Assume the support of $G$ is $[0, 1]$, and let $I_n$ be the set of intervals formed by the grid cells of an equidistant grid on $[0, 1]$ with spacings $n^{-\gamma}$ for some $\gamma \in (0, 1)$. Assume that the observation time $C$ is rounded to the midpoint of the interval in which it falls, and denote this rounded observation time by $D^{(n)}$. One can now consider the nonparametric maximum likelihood estimator for $F_0$ based on $n$ independent and identically distributed copies of $(D^{(n)}, \Delta)$. Since the likelihood in this grouped model can be written in exactly the same form as the likelihood in the discrete model, and since also the constraints on the two optimization problems are equivalent, the work of Tang et al. should carry over to this model, with the only difference that everything should be written in terms of weighted averages of $F_0$ over the grid cells. Second, consider the discrete model for current status data with competing risks, where the support of $G$ depends on $n$. Then the results of Tang et al. should carry over to the naive estimator $\tilde F_{nk}$, since this estimator can be viewed as a maximum likelihood estimator based on reduced current status data without competing risks. The same holds for the naive estimator $\tilde H_{nk}$ in the grouped model when the grouping intervals depend on $n$.

Finally, one can consider the limiting behavior of the maximum likelihood estimator for current status data with competing risks in the discrete and grouped models when the support of $G$ or the grouping intervals depend on $n$. The theory is much more involved in this case, and this problem is outside the scope of this paper. For sparse grids with $\gamma < 1/3$, we conjecture a normal limit as in Theorems 3 and 4, but with rate of convergence $n^{(1-\gamma)/2}$. For dense grids with $\gamma > 1/3$, we conjecture the limiting distribution of the smooth model (see Groeneboom et al., 2008c, Theorem 1·8). Finally, for the boundary case $\gamma = 1/3$, we conjecture that a new limiting distribution will appear.



7. Proofs 



Proof of Lemma 1. Due to the absence of monotonicity constraints on the class considered in Lemma 1, the maximizer of $l_n(F)$ over this class can be determined separately for each $s \in \mathcal{S}$. Thus, fix $s \in \mathcal{S}$, and define $l_n(F, s) = \sum_{k=1}^{K+1} N_k(s) \log\{F_k(s)\}$. Moreover, define $\mathcal{K} = \{k \in \{1, \ldots, K+1\} : N_k(s) > 0\}$ and $\mathcal{K}^c = \{1, \ldots, K+1\} \setminus \mathcal{K}$. First, suppose $\mathcal{K} = \emptyset$. Then $l_n(F, s) = 0$ for any choice of $F_k(s)$, $k = 1, \ldots, K$, and hence $\bar F_n(s)$ is a maximizer of $l_n(F, s)$. Next, suppose $\mathcal{K} \neq \emptyset$, or equivalently, $N(s) > 0$. Then any maximizer of $l_n(F, s)$ subject to the constraint $F_+(s) \le 1$ must set $F_k(s) = 0$ for $k \in \mathcal{K}^c$. Hence, for $k \in \mathcal{K}^c$ the maximizer is unique and equals $\bar F_{nk}(s)$. If $|\mathcal{K}| = 1$, $l_n(F, s)$ contains only one non-zero term, and it is clear that the corresponding $F_k(s)$ should be set to 1, which equals $\bar F_{nk}(s)$. If $|\mathcal{K}| > 1$, we define $k^* = \max \mathcal{K}$. Then $N_{k^*}(s) = N(s) - \sum_{k \in \mathcal{K} \setminus \{k^*\}} N_k(s)$ and any maximizer of $l_n(F, s)$ over this class must satisfy $F_{k^*}(s) = 1 - \sum_{k \in \mathcal{K} \setminus \{k^*\}} F_k(s)$. Hence, we can write

$$l_n(F, s) = \sum_{k \in \mathcal{K} \setminus \{k^*\}} N_k(s) \log\{F_k(s)\} + \Big\{N(s) - \sum_{k \in \mathcal{K} \setminus \{k^*\}} N_k(s)\Big\} \log\Big\{1 - \sum_{k \in \mathcal{K} \setminus \{k^*\}} F_k(s)\Big\}.$$

This function is strictly concave in $F_k(s)$ for $k \in \mathcal{K} \setminus \{k^*\}$. The unique maximizer can be determined by solving $\partial l_n(F, s)/\partial F_k(s) = 0$ for $k \in \mathcal{K} \setminus \{k^*\}$, which yields $F_k(s) = N_k(s)/N(s) = \bar F_{nk}(s)$, $k \in \mathcal{K}$.
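
The final step of this proof can be spelled out as follows. This is a sketch in the notation of the proof, with $\mathcal{K}$ the set of indices $k$ for which $N_k(s) > 0$ and $k^* = \max \mathcal{K}$; the symbol $\lambda$ is an auxiliary quantity introduced here for the common value of the ratios $N_k(s)/F_k(s)$.

```latex
\[
\frac{\partial l_n(F,s)}{\partial F_k(s)}
  = \frac{N_k(s)}{F_k(s)}
  - \frac{N_{k^*}(s)}{1 - \sum_{j \in \mathcal{K}\setminus\{k^*\}} F_j(s)} = 0,
  \qquad k \in \mathcal{K}\setminus\{k^*\}.
\]
% The second fraction is N_{k^*}(s)/F_{k^*}(s), so N_k(s)/F_k(s) = \lambda for all k in \mathcal{K}.
\[
1 = \sum_{k \in \mathcal{K}} F_k(s)
  = \frac{1}{\lambda} \sum_{k \in \mathcal{K}} N_k(s)
  = \frac{N(s)}{\lambda}
  \quad\Longrightarrow\quad
  \lambda = N(s), \qquad F_k(s) = \frac{N_k(s)}{N(s)}, \quad k \in \mathcal{K}.
\]
```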

Proof of Lemma 2. Let $s \in \mathcal{S}$ be a regular point in the discrete model. We first consider the maximum likelihood estimator for the "basic case" where $s \notin \{s_{\inf}, s_{\sup}\}$. Let $\mathcal{K}^+ = \{k \in \{1, \ldots, K\} : F_{0k}(s) > 0\}$. For $k \in \{1, \ldots, K\} \setminus \mathcal{K}^+$, we have $N_k(s) = 0$. Hence, the corresponding $F_k(s)$'s do not contribute to the likelihood and we directly obtain that the corresponding estimators satisfy $\hat F_{nk}(s) = \bar F_{nk}(s) = 0$. So we are done if $\mathcal{K}^+ = \emptyset$. Otherwise, we are left to show $\mathrm{pr}[\cap_{k \in \mathcal{K}^+}\{\hat F_{nk}(s) = \bar F_{nk}(s)\}] \to 1$ as $n \to \infty$. Define the events

$$A_n(s) = \cap_{k \in \mathcal{K}^+}\{\hat F_{nk}(s-) < \hat F_{nk}(s) < \hat F_{nk}(s+)\},$$

$$B_n(s) = \cap_{k \in \mathcal{K}^+ \cup \{K+1\}}\{N_k(s) > 0\}.$$

The assumptions on $s$ imply $F_{0k}(s-) < F_{0k}(s) < F_{0k}(s+)$ for $k \in \mathcal{K}^+$. By combining this with the consistency of $\hat F_n$ (Theorem 1), it follows that $\mathrm{pr}\{A_n(s)\} \to 1$ as $n \to \infty$. Moreover, the law of large numbers, $G(\{s\}) > 0$, $F_{0k}(s) > 0$ for $k \in \mathcal{K}^+$, and $F_{0+}(s) < 1$ imply $\mathrm{pr}\{B_n(s)\} \to 1$ as $n \to \infty$. Hence, $\mathrm{pr}\{A_n(s) \cap B_n(s)\} \to 1$ as $n \to \infty$, and the proof for the basic case can be completed by showing that the event $A_n(s) \cap B_n(s)$ implies $\cap_{k \in \mathcal{K}^+}\{\hat F_{nk}(s) = \bar F_{nk}(s)\}$. We do this using contraposition. Thus, suppose $\{A_n(s) \cap B_n(s)\}$ holds. This implies $k^* = K + 1$ in the proof of Lemma 1, and it follows that $\bar F_{nk}(s)$, $k \in \mathcal{K}^+$, is the unique solution of $\partial l_n(F)/\partial F_k(s) = 0$, $k \in \mathcal{K}^+$. Now assume there is a $j \in \mathcal{K}^+$ such that $\hat F_{nj}(s) \neq \bar F_{nj}(s)$. Then there must be a $k \in \mathcal{K}^+$ such that $\partial l_n(F)/\partial F_k(s)|_{\hat F_n(s)} \neq 0$. Let $\sigma \in \{-1, +1\}$ be the sign of $\partial l_n(F)/\partial F_k(s)|_{\hat F_n(s)}$, and define $F_n^{\mathrm{new}}(s) = \hat F_n(s) + \gamma \sigma e_k$, where $e_k$ is the unit vector in $\mathbb{R}^K$ with a 1 at the $k$th entry. Then for $\gamma > 0$ sufficiently small, replacing $\hat F_n(s)$ by $F_n^{\mathrm{new}}(s)$ increases the log likelihood. Moreover, this replacement does not violate the constraints of $\mathcal{F}_K$, as for $\gamma > 0$ sufficiently small we have $\hat F_{nk}(s-) < F_{nk}^{\mathrm{new}}(s) < \hat F_{nk}(s+)$ and $F_{n+}^{\mathrm{new}}(s) < \hat F_{n+}(s+) \le 1$. This shows that $\hat F_n$ cannot be the maximum likelihood estimator, which is a contradiction.

If $s \neq s_{\inf}$ and $s = s_{\sup}$, we distinguish two cases. If $F_{0+}(s) < 1$, the proof of the basic case goes through with the only change that $A_n(s) = \cap_{k \in \mathcal{K}^+}\{\hat F_{nk}(s-) < \hat F_{nk}(s)\} \cap \{\hat F_{n+}(s) < 1\}$. If $F_{0+}(s) = 1$, then $N_{K+1}(s) = 0$ and $1 - F_+(s)$ does not contribute to the log likelihood. Hence, the maximum likelihood estimator must satisfy $\hat F_{n+}(s) = 1$ and this equals $\bar F_{n+}(s)$ if $N(s) > 0$. If $K = 1$, this implies $\mathrm{pr}\{\hat F_n(s) = \bar F_n(s)\} \to 1$ as $n \to \infty$, so that we are done. If $K > 1$, we use the proof for the basic case with the following changes. We define $A_n(s) = \cap_{k \in \mathcal{K}^+}\{\hat F_{nk}(s-) < \hat F_{nk}(s)\}$ and $B_n(s) = \cap_{k \in \mathcal{K}^+}\{N_k(s) > 0\}$. As before, we have $\mathrm{pr}\{A_n(s) \cap B_n(s)\} \to 1$ as $n \to \infty$. We will therefore show that $\{A_n(s) \cap B_n(s)\}$ implies $\cap_{k \in \mathcal{K}^+}\{\hat F_{nk}(s) = \bar F_{nk}(s)\}$, using contraposition. Thus, assume $\{A_n(s) \cap B_n(s)\}$ holds. This implies $k^* = \max \mathcal{K}^+$ in the proof of Lemma 1, meaning that $\bar F_{nk}(s)$, $k \in \mathcal{K}^+$, are found by solving $\partial l_n(F)/\partial F_k(s) = 0$ for $k \in \mathcal{K}^+ \setminus \{k^*\}$ and setting $\bar F_{nk^*}(s) = 1 - \sum_{k \in \mathcal{K}^+ \setminus \{k^*\}} \bar F_{nk}(s)$. Assume $\hat F_{nk}(s) \neq \bar F_{nk}(s)$ for some $k \in \mathcal{K}^+$. Then there must be a $k \in \mathcal{K}^+ \setminus \{k^*\}$ such that $\partial l_n(F)/\partial F_k(s)|_{\hat F_n(s)} \neq 0$. Define $\sigma$ as the sign of this derivative and define $F_n^{\mathrm{new}}(s) = \hat F_n(s) + \gamma \sigma e_k - \gamma \sigma e_{k^*}$. Then for $\gamma > 0$ sufficiently small, replacing $\hat F_n(s)$ by $F_n^{\mathrm{new}}(s)$ increases the log likelihood. Moreover, this replacement does not violate the constraints of $\mathcal{F}_K$, as for $\gamma > 0$ sufficiently small we have $\hat F_{nk}(s-) < F_{nk}^{\mathrm{new}}(s)$, $\hat F_{nk^*}(s-) < F_{nk^*}^{\mathrm{new}}(s)$, and $F_{n+}^{\mathrm{new}}(s) = \hat F_{n+}(s) = 1$. Hence, $\hat F_n$ cannot be the maximum likelihood estimator, and we have again derived a contradiction.

The proof for the maximum likelihood estimator is completed by considering two remaining special cases. If $s = s_{\inf}$ and $s \neq s_{\sup}$, then the proof for the basic case goes through with the only change that $A_n(s) = \cap_{k \in \mathcal{K}^+}\{0 < \hat F_{nk}(s) < \hat F_{nk}(s+)\}$. If $s = s_{\inf} = s_{\sup}$, then $|\mathcal{S}| = 1$ and monotonicity constraints do not play any role in the maximum likelihood estimator, so that $\hat F_n = \bar F_n$ follows immediately.

The proof for the naive estimator follows directly from the proof for the maximum likelihood estimator by taking $K = 1$. To see this, let $k \in \{1, \ldots, K\}$ and recall that the naive estimator $\tilde F_{nk}$ is the maximum likelihood estimator for the reduced current status data $(\Delta_k^i, C_i)$, $i = 1, \ldots, n$. Hence, the proof for the maximum likelihood estimator implies $\mathrm{pr}\{\tilde F_{nk}(s) = \bar F_{nk}^{\mathrm{red}}(s)\} \to 1$ as $n \to \infty$, where $\bar F_{nk}^{\mathrm{red}}$ is the simple estimator based on the reduced data. The proof is completed by observing that $\bar F_{nk}^{\mathrm{red}} = \bar F_{nk}$.
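
The content of Lemma 2 for a single failure cause can be illustrated numerically. The sketch below relies on a standard characterization that is assumed here rather than derived in this paper: for current status data without competing risks, the maximum likelihood estimator on a discrete support is the isotonic regression of the proportions $N_k(s)/N(s)$ with weights $N(s)$. The function name and the counts are hypothetical. When the simple estimator is already monotone, no pooling takes place and the two estimators coincide.

```python
def isotonic_fit(n_k, n_tot):
    """Weighted isotonic regression of n_k[j] / n_tot[j] over the sorted support.

    All entries of n_tot are assumed to be positive.
    """
    blocks = []  # [sum of n_k, sum of n_tot, number of pooled support points]
    for a, b in zip(n_k, n_tot):
        blocks.append([a, b, 1])
        # Pool adjacent blocks while their ratios violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            a2, b2, m2 = blocks.pop()
            blocks[-1][0] += a2
            blocks[-1][1] += b2
            blocks[-1][2] += m2
    return [a / b for a, b, m in blocks for _ in range(m)]

# Hypothetical counts at three support points s = 10, 20, 30.
n_k = [48, 160, 195]     # failures from cause k observed at each s
n_tot = [330, 335, 335]  # subjects observed at each s
simple = [a / b for a, b in zip(n_k, n_tot)]
fit = isotonic_fit(n_k, n_tot)
print(fit == simple)                    # monotone input: no pooling needed
print(isotonic_fit([5, 1], [10, 10]))   # non-monotone input: the two points are pooled
```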

Proof of Theorem 3. Because of Lemma 2, it is sufficient to derive the limiting distribution of $\bar F_n$. Let $k \in \{1, \ldots, K\}$ and $s \in \mathcal{S}$. Since $\mathrm{pr}\{N(s) > 0\} \to 1$ as $n \to \infty$, we can assume $N(s) > 0$. We first consider the case $0 < F_{0k}(s) < 1$. Then

$$n^{1/2}\{\bar F_{nk}(s) - F_{0k}(s)\} = N(s)^{-1} n^{1/2}\{N_k(s) - F_{0k}(s) N(s)\} = N(s)^{-1} n^{-1/2} \sum_{i=1}^{n} \{\Delta_k^i - F_{0k}(s)\} 1\{C_i = s\},$$

and the result follows from $N(s) \to_p G(\{s\})$, the multivariate central limit theorem, and Slutsky's lemma (e.g., van der Vaart, 1998, Lemma 2·8 (iii)).

If $F_{0k}(s) = 0$, then $N_k(s) = 0$ and hence $\bar F_{nk}(s) = 0 = F_{0k}(s)$ always. Similarly, if $F_{0k}(s) = 1$, we have $N_k(s) = N(s)$ and hence $\bar F_{nk}(s) = 1 = F_{0k}(s)$ whenever $N(s) > 0$. These results are in agreement with the theorem, since in these cases $\{V(s)\}_{k,k} = 0$, leading to a degenerate limiting distribution that should be interpreted as a point mass at zero. It can be easily verified that the off-diagonal elements $\{V(s)\}_{k,j} = 0$ for $j \in \{1, \ldots, K\}$, $j \neq k$, are also correct in these cases.
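
The decomposition in this proof suggests a quick Monte Carlo check. In the sum, each term has mean zero and variance $G(\{s\}) F_{0k}(s)\{1 - F_{0k}(s)\}$, so after division by $G(\{s\})$ the limiting variance should be $F_{0k}(s)\{1 - F_{0k}(s)\}/G(\{s\})$. That expression is derived here from the display above and is assumed, not verified, to match the diagonal of $V(s)$ in Theorem 3. The function name, the three-point support and the value p = 0·3 are arbitrary illustrative choices.

```python
import random

def scaled_error(n, p, support, s, rng):
    """n^{1/2} * (N_k(s)/N(s) - p) for one simulated data set; None if nobody is observed at s."""
    n_s = n_ks = 0
    for _ in range(n):
        if rng.choice(support) == s:      # C uniform on the support
            n_s += 1
            n_ks += rng.random() < p      # failure from cause k with probability p
    return None if n_s == 0 else n ** 0.5 * (n_ks / n_s - p)

rng = random.Random(3)
n, p, support, s = 300, 0.3, [1, 2, 3], 2
draws = [scaled_error(n, p, support, s, rng) for _ in range(2000)]
draws = [d for d in draws if d is not None]
mean = sum(draws) / len(draws)
emp_var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
limit_var = p * (1 - p) / (1 / len(support))   # F(1 - F) / G({s}) with G({s}) = 1/3
print(round(emp_var, 3), round(limit_var, 3))
```

With these settings the empirical variance should land close to the limiting value, up to Monte Carlo error and a small finite-sample inflation from the randomness of the number of subjects observed at s.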